 vapnik-chervonenkis dimension


Neural Network Learning and Quantum Gravity

arXiv.org Artificial Intelligence

The landscape of low-energy effective field theories stemming from string theory is too vast for a systematic exploration. However, the meadows of the string landscape may be fertile ground for the application of machine learning techniques. Employing neural network learning may allow for inferring novel, undiscovered properties that consistent theories in the landscape should possess, or checking conjectural statements about alleged characteristics thereof. The aim of this work is to describe to what extent the string landscape can be explored with neural network-based learning. Our analysis is motivated by recent studies that show that the string landscape is characterized by finiteness properties, emerging from its underlying tame, o-minimal structures. Indeed, employing these results, we illustrate that any low-energy effective theory of string theory is endowed with certain statistical learnability properties. Consequently, several learning problems therein formulated, including interpolations and multi-class classification problems, can be concretely addressed with machine learning, delivering results with sufficiently high accuracy.


On the Vapnik-Chervonenkis dimension of products of intervals in $\mathbb{R}^d$

arXiv.org Machine Learning

We study combinatorial complexity of certain classes of products of intervals in $\mathbb{R}^d$, from the point of view of Vapnik-Chervonenkis geometry. As a consequence of the obtained results, we conclude that the Vapnik-Chervonenkis dimension of the set of balls in $\ell_\infty^d$ -- which denotes $\mathbb{R}^d$ equipped with the sup norm -- equals $\lfloor (3d+1)/2\rfloor$.


The information-theoretic value of unlabeled data in semi-supervised learning

arXiv.org Machine Learning

We quantify the separation between the numbers of labeled examples required to learn in two settings: with and without knowledge of the distribution of the unlabeled data. More specifically, we prove a separation by a $\Theta(\log n)$ multiplicative factor for the class of projections over the Boolean hypercube of dimension $n$. We prove that there is no separation for the class of all functions on a domain of any size. Learning with knowledge of the distribution (a.k.a. fixed-distribution learning) can be viewed as an idealized scenario of semi-supervised learning where the number of unlabeled data points is so great that the unlabeled distribution is known exactly. For this reason, we call the separation the value of unlabeled data.


Optimal Bounds on the VC-dimension

arXiv.org Machine Learning

The VC-dimension of a set system is a way to capture its complexity and has been a key parameter studied extensively in the machine learning and geometry communities. In this paper, we resolve two longstanding open problems on bounding the VC-dimension of two fundamental set systems: $k$-fold unions/intersections of half-spaces, and the simplices set system. Among other implications, it settles an open question in machine learning that was first studied in the 1989 foundational paper of Blumer, Ehrenfeucht, Haussler and Warmuth, as well as by Eisenstat and Angluin, and by Johnson.


Multi-distance Support Matrix Machines

arXiv.org Machine Learning

Real-world data such as digital images, MRI scans and electroencephalography signals are naturally represented as matrices with structural information. Most existing classifiers aim to capture these structures by regularizing the regression matrix to be low-rank or sparse. Some other methodologies introduce factorization techniques to explore nonlinear relationships of matrix data in kernel space. In this paper, we propose a multi-distance support matrix machine (MDSMM), which provides a principled way of solving matrix classification problems. The multi-distance is introduced to capture the correlation within matrix data, by means of intrinsic information in rows and columns of input data. A complex hyperplane is established upon these values to separate distinct classes. We further study the generalization bounds for i.i.d. and non-i.i.d. processes based on both SVM and SMM classifiers. For typical hypothesis classes where matrix norms are constrained, MDSMM achieves a faster learning rate than traditional classifiers. We also provide a more general approach for samples without prior knowledge. We demonstrate the merits of the proposed method by conducting exhaustive experiments on both a simulation study and a number of real-world datasets.


Book Reviews

AI Magazine

Parametric tests are only valid if the data satisfy certain assumptions. If these assumptions hold, they will, however, typically give more accurate results. The analysis of statistical learning theory has very much the flavor of a nonparametric statistical test. The weakness of PAC, therefore, is that its results must hold true even in worst-case distributions. There is, however, a new twist to this story in that the more recent PAC-style results are able to take account of observed attributes of the function that has been chosen by the learner, for example, its margin on the training set.


The Vapnik-Chervonenkis dimension of cubes in $\mathbb{R}^d$

arXiv.org Machine Learning

The Vapnik-Chervonenkis (VC) dimension of a collection of subsets of a set is an important combinatorial concept in settings such as discrete geometry and machine learning. In this paper we prove that the VC dimension of the family of $d$-dimensional cubes in $\mathbb R^d$ is $\lfloor(3d+1)/2\rfloor$.
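As a rough illustration of the quoted value $\lfloor(3d+1)/2\rfloor$, the following Python sketch evaluates the formula and brute-force checks the simplest case $d = 1$, where a cube is just a closed interval. The function names `vc_dim_cubes` and `intervals_shatter` are hypothetical helpers introduced here for illustration; they are not taken from the paper, and the check assumes the input points are distinct.

```python
from itertools import combinations


def vc_dim_cubes(d: int) -> int:
    """Evaluate floor((3d + 1) / 2), the value stated in the abstract."""
    return (3 * d + 1) // 2


def intervals_shatter(points) -> bool:
    """Brute-force check for d = 1: do closed intervals shatter `points`?

    Assumes the points are distinct real numbers. A nonempty subset can be
    cut out by an interval exactly when the smallest interval containing it
    picks up no other point. The empty subset is always realizable by an
    interval placed away from all points, so it is skipped.
    """
    pts = sorted(points)
    for k in range(1, len(pts) + 1):
        for subset in combinations(pts, k):
            lo, hi = subset[0], subset[-1]
            picked = tuple(p for p in pts if lo <= p <= hi)
            if picked != subset:
                return False
    return True


if __name__ == "__main__":
    for d in range(1, 6):
        print(d, vc_dim_cubes(d))
    # Two points on the line are shattered, three are not,
    # which agrees with the formula at d = 1.
    print(intervals_shatter([0.0, 1.0]))
    print(intervals_shatter([0.0, 1.0, 2.0]))
```

The brute-force part covers only the one-dimensional case; checking shattering by cubes in higher dimensions needs a feasibility test over centers and side lengths, which this sketch does not attempt.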


Neural Network Learning: Theoretical Foundations

AI Magazine

The scientific method aims to derive mathematical models that help us to understand and exploit phenomena, whether they be natural or human made. Machine learning, and more particularly learning with neural networks, can be viewed as just such a phenomenon. Frequently, remarkable performance is obtained by training networks to perform relatively complex AI tasks. Despite this success, most practitioners would readily admit that they are far from fully understanding why and, more importantly, when the techniques can be expected to be effective. The need for a fuller theoretical analysis and understanding of their performance has been a major research objective for the last decade. Neural Network Learning: Theoretical Foundations reports on important developments that have been made toward this goal within the computational learning theory framework.


Lower Bounds on the Complexity of Approximating Continuous Functions by Sigmoidal Neural Networks

Neural Information Processing Systems

This is one of the theoretical results most frequently cited to justify the use of sigmoidal neural networks in applications. By this statement one refers to the fact that sigmoidal neural networks have been shown to be able to approximate any continuous function arbitrarily well. Numerous results in the literature have established variants of this universal approximation property by considering distinct function classes to be approximated by network architectures using different types of neural activation functions with respect to various approximation criteria, see for instance [1, 2, 3, 5, 6, 11, 12, 14, 15].

